ci: deterministic check for studio/frontend dep removals#5478
Conversation
Adds a CI gate that catches the common foot-gun: a dep dropped from
studio/frontend/package.json that something in src/ still imports.
scripts/check_frontend_dep_removal.py
Diffs package.json against a git base ref, collects every package
no longer declared, and for each one:
1. Greps the entire repo for any usage pattern (static / dynamic /
side-effect imports, require, CSS @import, HTML script/link
src, new URL(), triple-slash references, template literals,
bare quoted strings in JS-like files).
2. Resolves whether the package would still install by BFS'ing
the dep graph in the new lockfile starting from the new
package.json's declared deps (so a stale lockfile does not
give false OK-via-transitive results).
3. Distinguishes top-level node_modules/<name> from nested copies
under other packages. Bare src/ imports only resolve to the
top-level path.
4. Pip-installed playwright references are filtered, so removing
the npm playwright (CI uses the pip one) is reported correctly.
Additional hygiene checks (warnings, fail with --strict):
- lockfile <root> dep map matches package.json (catches drift).
- @types/X is not orphaned when X is no longer declared.
- No src/ import points at a package not declared in any field.
tests/studio/test_frontend_dep_removal.py
24 deterministic cases. Each patches a copy of the head
package.json, runs the script, and asserts (exit status,
reported FAIL list). Covers:
- Genuinely-breaking removals: next-themes, @xyflow/react,
@huggingface/hub, dexie, motion, canvas-confetti, recharts,
node-forge, mammoth, unpdf.
- Safe-via-transitive removals: katex, clsx, react,
@radix-ui/react-slot, zustand, tailwind-merge, remark-gfm,
date-fns, js-yaml, @tauri-apps/api.
- Mixed multi-removal failing on the unsafe entries only.
- Non-existent / not-in-base names (no-op).
- Move from deps to devDeps (not a removal).
.github/workflows/studio-frontend-ci.yml
Runs the checker on pull_request events against
origin/${{ github.base_ref }}, plus the edge-case suite.
classify() now catches sneaky shapes that an earlier line-only scan
would miss:
- multi-line `import { a, b } from "pkg"` and the same shape for
`export { ... } from "pkg"` / `export * from "pkg"` /
`export type ... from "pkg"`.
- JSDoc `@import("pkg")` references.
- Word-boundary fix so `foo` no longer matches `foobar` (subpath gate:
after the package name we require closing quote or `/`).
- Negative-lookbehind on `(?<!@)\bimport\b` so CSS `@import "X"` is
classified as css_import, not side_effect_import.
find_usage() now feeds an 8-line window (4 above / 4 below the grep
hit) into classify() so multi-line import statements are picked up
even though the initial grep is line-based.
tests/studio/test_frontend_dep_removal.py now exercises three suites:
- 24 edge cases: subprocess-driven, full-pipeline.
- 28 classify() unit cases: direct function call against hand-crafted
snippets. Covers static / side-effect / dynamic / require /
css_import / html_script / html_link / re_export (4 variants) /
template_literal / new_url / tsc_triple_slash / jsdoc_import /
string_literal, plus false-positive guards (substring collision,
plain-text comments, URL path tails, Python files, markdown).
- 12 adversarial cases: write synthetic files under
studio/frontend/src/__dep_check_adversarial__/, run the full
script, then clean up. Confirms multi-line imports, re-exports,
JSDoc @import, new URL, dynamic imports all FAIL when the
underlying package is removed.
Current total: 64 / 64 cases pass.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 231d4f12cf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| working-directory: ${{ github.workspace }} | ||
| run: | | ||
| python3 scripts/check_frontend_dep_removal.py \ | ||
| --base "origin/${{ github.base_ref }}" |
There was a problem hiding this comment.
Fetch the PR base before diffing dependencies
On pull_request runs, the checkout at the top of this job uses actions/checkout's default shallow fetch, which only fetches the triggering ref/commit, so origin/${{ github.base_ref }} is not guaranteed to exist in the workspace. In that common PR context, this command makes git show origin/main:studio/frontend/package.json fail inside read_pkg_at(), causing the new safety check to exit 2 and block frontend CI before typecheck/build even run; fetch the base ref/history or use github.event.pull_request.base.sha after fetching it.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Code Review
This pull request introduces a new utility script, scripts/check_frontend_dep_removal.py, and a corresponding test suite to prevent breaking changes when removing npm dependencies from the frontend. The script verifies that removed packages are no longer referenced in the codebase and are not required transitively. Review feedback identifies duplicated logic in the main function regarding lockfile handling, suggests explicitly specifying UTF-8 encoding for file reads to ensure cross-platform consistency, and recommends removing the unused lock_resolvable function.
| for file, lineno, content in rows: | ||
| if pkg == "playwright" and PIP_PLAYWRIGHT.search(content): | ||
| continue | ||
| # Try the single-line classify first. |
There was a problem hiding this comment.
This block contains duplicated code. The line head_lock = read_pkg_file(Path(args.head_lock)) is executed regardless of the condition. Additionally, the --base-lock argument is defined but its value is never used, which can be misleading.
You can simplify this to a single line. If the --base-lock argument is indeed unnecessary, consider removing it from the argument parser as well.
head_lock = read_pkg_file(Path(args.head_lock))References
- When a condition or calculated value is used across multiple conditional branches, compute it once and reuse the result to ensure consistency and improve maintainability.
| res = subprocess.run( | ||
| cmd, | ||
| cwd = cwd or REPO_ROOT, | ||
| stdout = subprocess.PIPE, |
There was a problem hiding this comment.
It's a good practice to explicitly specify the encoding when reading text files to avoid issues on different platforms. Since package.json and package-lock.json files are expected to be UTF-8, please add encoding="utf-8".
| res = subprocess.run( | |
| cmd, | |
| cwd = cwd or REPO_ROOT, | |
| stdout = subprocess.PIPE, | |
| def read_pkg_file(path: Path) -> dict: | |
| if not path.exists(): | |
| return {} | |
| return json.loads(path.read_text(encoding="utf-8")) |
| if not out.strip(): | ||
| return {} | ||
| return json.loads(out) | ||
|
|
||
|
|
||
| def read_pkg_file(path: Path) -> dict: | ||
| if not path.exists(): | ||
| return {} | ||
| return json.loads(path.read_text()) | ||
|
|
||
|
|
Catches the last common false-negative: removing a package whose
bin is only referenced through `package.json` scripts (e.g. dropping
typescript while `"build": "tsc -b && vite build"` calls tsc).
Cross-checked the patterns Vercel/Next.js, Vite, and TanStack use
in their own manifests; the bin/scripts pairing is the one
consumer-side pattern dep checkers commonly miss.
How it works:
- Build a bin-to-package map from each lockfile entry's `bin`
field. The map is global so a stale lockfile still resolves
bins from packages about to be pruned.
- Tokenize each script value, splitting on `&&`, `||`, `;`, `|`.
Strip env-var assignments and `npx / pnpx / yarn / pnpm / bunx`
prefixes, plus `./node_modules/.bin/` and `node_modules/.bin/`
path prefixes. Look up the leading token in the bin map.
- Hits are reported as `script_bin` and feed the same reachability
gate as source imports. A bin still installed transitively
(e.g. vite via @vitejs/plugin-react peer) is OK-via-transitive;
an orphaned bin is FAIL.
Test additions:
- 5 new edge cases: removing vite, typescript, eslint, @biomejs/biome,
and (@biomejs/biome + @vitejs/plugin-react) together. Correctly
flags @biomejs/biome and the combo as FAIL while vite / typescript
/ eslint are kept by peers.
- 8 new classify() unit cases: TypeScript ambient `declare module`,
namespace imports, combined default+named, default-as-named,
re-export default (4 forms), `.then()` dynamic imports without
await, and TypeScript `import()` in type position.
Current total: 29 edge + 36 classify-unit + 12 adversarial = 77 / 77.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 17705256c3
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if args.base_lock: | ||
| head_lock = read_pkg_file(Path(args.head_lock)) | ||
| else: | ||
| head_lock = read_pkg_file(Path(args.head_lock)) |
There was a problem hiding this comment.
Use --base-lock when the override is provided
The --base-lock flag is parsed and documented, but this branch reads args.head_lock in both cases, so providing --base-lock has no effect. That breaks the advertised CLI contract for offline/synthetic comparisons and can silently produce results from the wrong lockfile when users think they are overriding the base lock input.
Useful? React with 👍 / 👎.
| python3 scripts/check_frontend_dep_removal.py \ | ||
| --base "origin/${{ github.base_ref }}" | ||
| python3 tests/studio/test_frontend_dep_removal.py |
There was a problem hiding this comment.
Include checker files in PR path triggers
This new CI step executes scripts/check_frontend_dep_removal.py and tests/studio/test_frontend_dep_removal.py, but the workflow's pull_request.paths filter only includes studio/frontend/** and the workflow file itself. As a result, PRs that modify either checker file can bypass this job entirely, allowing regressions in the new safety gate to merge without running its own CI.
Useful? React with 👍 / 👎.
After surveying package.json patterns in 10+ popular repos (React,
Vue/Svelte/Astro/Next.js, Vite, Storybook, TanStack/Query, Tailwind,
ESLint, TypeScript, Prettier, SvelteKit), several config fields in
package.json itself can reference packages by string. My checker
filtered all of package.json out of the string_literal fallback,
so removing a package that is only referenced from one of these
fields was a false negative.
Now covered (new pkg_json_field kind):
- overrides / resolutions / pnpm.overrides keys
- pnpm.patchedDependencies keys
- peerDependenciesMeta keys
- prettier: "@my/prettier-config" string
- eslintConfig.extends (string or array)
- stylelint.extends / stylelint.plugins
- babel.presets / babel.plugins
- jest.preset / jest.setupFiles / jest.transform
- commitlint.extends
- renovate.extends
- remarkConfig.plugins
- any other tool config field whose strings/keys equal the pkg
name or `pkg/subpath`
False-positive guards (do not flag string values inside):
- browserslist (browser queries)
- keywords (free-form strings)
- engines / engineStrict / packageManager / volta (version pins)
- files / directories / publishConfig (paths)
- workspaces (paths/globs)
- main / module / browser / types / typings / exports / imports /
bin / man (author-side fields)
- scripts (already handled separately via scripts_bin_refs)
- name / version / description / author / repository / homepage etc.
Test additions: new PkgFieldCase suite with 19 cases covering each
tool config field, subpath references, and the 5 false-positive
guards. Combined with the existing 29 edge / 36 classify / 12
adversarial cases, the suite is 96 / 96.
Adds an opt-in dead-dep enumeration to the existing safety check.
Iterates every package declared in studio/frontend/package.json
(all four dep fields combined) and reports each as one of:
used at least one detected reference -- in src/, a
config file, package.json scripts (bin), a
package.json tool-config field (overrides /
prettier / eslintConfig / stylelint / babel /
jest / commitlint / renovate / etc.), or
tsconfig.compilerOptions.types
unused no detected reference anywhere
type_pkg_kept @types/X where X is still declared (or X = node,
always implicit)
type_pkg_orphan @types/X where X is no longer declared --
candidate for removal alongside X
Wiring:
- New CLI flag `--enumerate-dead` (off by default).
- CI workflow now passes `--enumerate-dead` so the report shows on
every PR run; the report is informational unless `--strict` is
also set.
- With `--strict`, unused / type_pkg_orphan entries fail the run.
Tests:
- 5 new EnumCase scenarios:
E01 fake dep with no usage -> reported unused
E02 fake dep imported by a synthetic src file -> reported used
E03 fake dep referenced only in overrides -> reported used
E04 @types/X paired with X (also imported) -> kept
E05 @types/X without X -> orphan
Running the new flag against the current main reproduces exactly the
11 deps PR #5477 removed, validating the heuristic end to end.
Current total: 29 edge + 36 classify + 12 adversarial + 19 pkg-json
field + 5 enumeration = 101 / 101.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2660f408ec
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if not path.exists(): | ||
| return {} |
There was a problem hiding this comment.
Treat missing head lockfile as invocation error
read_pkg_file() returns {} when the --head-lock path is missing, but main() never validates that case before proceeding. That makes the checker silently run without lock metadata and can report false PASS results for removals that are only referenced through npm script bins (for example, removing @biomejs/biome with a bad --head-lock path returns 0 even though biome scripts break). Since the CLI contract says missing files should be an invocation error, this should exit 2 instead of continuing with an empty lock graph.
Useful? React with 👍 / 👎.
actions/checkout uses fetch-depth: 1 by default, so when the dependency removal check ran `git show origin/main:.../package.json` the ref wasn't available locally and the script exited 2 with "could not read base package.json at origin/main:...". Fetch the single base commit before invoking the check so the git-show lookup resolves. --depth=1 keeps the extra fetch cheap.
Five issues flagged across gemini and codex:
* --base-lock argparse arg was defined and advertised in the
docstring, but main() always read args.head_lock in both branches
-- the flag did nothing. Dropped the dead arg and the misleading
docstring line; the lockfile-reachability analysis only needs the
head lockfile.
* lock_resolvable() was defined but never called. Removed.
* read_pkg_file() did not specify an encoding for read_text().
Added encoding="utf-8" for cross-platform stability.
* read_pkg_file() returned {} when the path did not exist, so a
bad --head-lock value silently bypassed the reachability checks
(false PASS for removals that resolve through npm script bins).
main() now exits 2 with a clear message when the head lockfile
is missing, matching the existing behavior for the head pkg.
* studio-frontend-ci.yml pull_request paths filter only matched
studio/frontend/** and the workflow file, so PRs that modified
the checker script or its test could skip this job. Added both
files to the trigger.
|
Addressing the codex + gemini review:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bdf6f84844
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| print() | ||
|
|
||
| reachable_paths = reachable_from_head(head_pkg, head_lock) if head_lock else set() | ||
| bin_to_pkg = build_bin_to_pkg(head_lock) if head_lock else {} |
There was a problem hiding this comment.
Track script bins from the base lockfile
When a PR removes a package that is used only through an npm script and updates package-lock.json at the same time, building bin_to_pkg from the head lockfile drops the removed package's bin metadata before scripts_bin_refs() runs. For example, removing @biomejs/biome from both package.json and the lockfile leaves biome:check/biome:fix pointing at a missing binary, but this checker reports [OK] @biomejs/biome because the head lock no longer contains the biome -> @biomejs/biome mapping. This lets the new safety gate miss broken npm scripts for properly updated removals.
Useful? React with 👍 / 👎.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new script, check_frontend_dep_removal.py, along with a comprehensive test suite to prevent breaking changes when removing npm dependencies. The script identifies removed packages by diffing package.json and verifies they are no longer referenced in the codebase or remain resolvable transitively. Review feedback suggests several improvements to the script's robustness and efficiency, including expanding the import detection regex to cover dynamic imports and require calls, updating the list of Node.js builtins and internal aliases to prevent false positives, and optimizing the repository scanning process to avoid redundant file system traversals.
| for f in DEP_FIELDS: | ||
| decl.update((head_pkg.get(f) or {}).keys()) | ||
| # Also: anything tsconfig path-aliases (just '@/...' here) is internal | ||
| pattern = r"(?:from|import)\s+['\"]([^'\"./][^'\"]*)['\"]" |
There was a problem hiding this comment.
The regex pattern used for the reverse check (finding imports without declarations) only identifies static import and from statements. It misses dynamic import() calls and require() statements, which are correctly handled in the primary removal check. This inconsistency might lead to missing undeclared dependencies that are only used dynamically.
| pattern = r"(?:from|import)\s+['\"]([^'\"./][^'\"]*)['\"]" | |
| pattern = r"(?:from|import|require)\s*\(?\s*['\"]([^'\"./][^'\"]*)['\"]" |
| continue | ||
| if pkg_name in { |
There was a problem hiding this comment.
The internal alias check only accounts for the exact string "@". Most TypeScript configurations in this project likely use "@/*" for path mapping (as hinted by the comment), resulting in specifiers like "@/components/...". These will currently trigger false positive "missing dependency" warnings because they don't match the exact "@" check.
# Internal aliases like '@/foo' or starts with builtin names
if pkg_name == "@" or pkg_name.startswith("@/"):
continue| if pkg_name in { | ||
| "node:fs", | ||
| "node:path", | ||
| "fs", | ||
| "path", | ||
| "url", | ||
| "stream", | ||
| "crypto", | ||
| "buffer", | ||
| "util", | ||
| "events", | ||
| "child_process", | ||
| }: | ||
| continue |
There was a problem hiding this comment.
The list of Node.js builtin modules is incomplete. Common modules such as os, http, https, and zlib are missing. Additionally, the check does not account for the node: protocol prefix, which is the modern standard for importing builtins. This can lead to false positive warnings during the hygiene check.
if pkg_name in {
"assert", "async_hooks", "buffer", "child_process", "cluster",
"console", "constants", "crypto", "dgram", "diagnostics_channel",
"dns", "domain", "events", "fs", "http", "http2", "https",
"inspector", "module", "net", "os", "path", "perf_hooks",
"process", "punycode", "querystring", "readline", "repl",
"stream", "string_decoder", "timers", "tls", "trace_events",
"tty", "url", "util", "v8", "vm", "wasi", "worker_threads", "zlib",
} or pkg_name.startswith("node:"):
continue|
|
||
| failures: list[tuple[str, list[Hit]]] = [] | ||
| for name in removed: | ||
| hits = find_usage(name) |
There was a problem hiding this comment.
Executing a full repository grep for every removed package is inefficient. If multiple dependencies are removed, the entire codebase is scanned repeatedly. This scales poorly, especially when using the --enumerate-dead flag which triggers a scan for every declared dependency (currently ~67). Batching the search into a single grep execution using a combined regex pattern for all target packages would significantly improve performance.
References
- To improve efficiency, avoid redundant data iterations. Combine checks and transformations into a single loop and return computed values for callers to reuse.
- To improve efficiency, avoid redundant file traversals.
Eight P1s and three P2s surfaced across 10 codex reviewers; this commit addresses all of them. P1s: 1. Workflow refspec. `git fetch --depth=1 origin <base_ref>` may only create FETCH_HEAD in shallow PR checkouts; the checker then dies with `fatal: invalid object name 'origin/main'`. Use the explicit refspec `<base>:refs/remotes/origin/<base>` so origin/<base> is reliably created. 2. `_deps_of()` was counting optional peer dependencies as reachable. npm only installs an optional peer when another package declares the same dep, so for "is this removed package still in the tree" they cannot keep it alive on their own. Skip entries marked `optional: true` in `peerDependenciesMeta`. 3. JS-syntactic classifiers (static_import, side_effect_import, dynamic_import, require, re_export, jsdoc_import, template_literal, tsc_triple_slash, new_url) now gate on file extension. Previously only the final string-literal fallback was gated, so a JS-shaped string inside a Python fixture or a Markdown code fence triggered a false FAIL. Added U37-U40 covering .py / .md / .sh / .yml. 4. HTML `<script src=>` and `<link href=>` patterns now respect a package-name boundary so `/node_modules/foo-extra/...` is not treated as a usage of `foo`. Added U41-U43. 5. New `find_command_usage()` detects CLI invocations in .sh / .yml / .yaml / .ps1 / .bat / Dockerfile* (npx pkg, bunx pkg, pnpm exec pkg, yarn dlx pkg, or a bare pkg --flag). Also covers scoped CLI packages exposed by their unscoped tail (@biomejs/biome -> biome). 6. `build_bin_to_pkg(head_lock)` was losing the bin -> package map for packages the PR correctly removed from the lockfile, so `scripts.biome:check` no longer flagged when @biomejs/biome was being dropped. Now also read the base lockfile (via `git show` or the new `--base-lock` override) and layer its bin map on top for any package in the removed set. 7. `--strict` now runs hygiene checks (lockfile sync, @types orphans, undeclared imports, dead-deps) on the no-removal path too. Previously the early return at "[OK] no dependencies removed" skipped them, so `--strict` silently passed on a tree with uncommitted lockfile drift or unused deps. 8. Removed `@types/X` packages are now matched against the runtime target name `X`: `/// <reference types="X" />`, tsconfig compilerOptions.types entries, AND runtime `import "X"` shapes. Handles the npm scope encoding (`@types/foo__bar` -> `@foo/bar`). P2s: 9. CSS `url(...)` now accepts both quoted and unquoted forms (added U44-U45). The previous regex required `/{pkg}/` after a slash, missing bare-package urls like `url(katex/fonts/x.woff2)`. 10. `find_imports_without_decl()` now covers all static-import shapes: `import "pkg"`, `import Foo from "pkg"`, `import { Foo } from "pkg"`, `import type { Foo } from "pkg"`, `await import("pkg")`, `require("pkg")`. 11. (Same as #8.) Removed `@types/X` is also linked to runtime imports of `X`, not just type-only references. Test suite expanded from 101 to 110 cases; all pass. Real-world enumerate-dead still flags the same 11 unused packages on studio/dep-removal-safety-check (matches PR 5477's removal set).
for more information, see https://pre-commit.ci
|
Addressing the 10x reviewer run (commit
Local verification: Real-world enumeration still flags the exact set PR #5477 was originally dropping, so the new gates don't introduce false positives on this repo. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8d1b21c0ed
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if first in {"npx", "pnpx", "yarn", "pnpm", "bunx"} and idx + 1 < len( | ||
| words | ||
| ): | ||
| idx += 1 | ||
| first = words[idx] |
There was a problem hiding this comment.
Parse pnpm exec and yarn dlx script wrappers correctly
Update the wrapper parsing in scripts_bin_refs() to skip subcommands before resolving the binary name. Right now the code advances only one token for pnpm/yarn, so pnpm exec biome check is interpreted as binary exec and yarn dlx eslint as binary dlx, which never map back to the removed package. This creates a false PASS when a dependency is only referenced via those script forms. (For example, pnpm help exec shows usage pnpm ... exec <command>, so the executable token is after exec, not immediately after pnpm.)
Useful? React with 👍 / 👎.
Three blockers from the parallel Opus review batch: 1. scripts_bin_refs ignored every script that began with a wrapper. The original "first non-env token wins" heuristic credited cross-env / dotenv / dotenvx / env-cmd as the bin, so a script like `cross-env CI=1 biome check` left @biomejs/biome looking unused. Rewrote into _next_real_bin(), which peels env prefixes, the leading package-manager runner (npx / pnpx / bunx / pnpm exec / yarn dlx), and the known wrapper bins (with --/-flag-arg handling) before returning the real CLI. shlex tokenization preserves quoted env values like `FOO="a b"`. 2. enumerate_dep_usage skipped find_command_usage. The non-enumerate path already credited deps used only from CI / Dockerfile / shell scripts, but `--enumerate-dead` did not, so packages referenced only from a workflow were silently listed as dead. Added the same call (gated against @types/* to avoid the unscoped-tail false positive). 3. classify multi-line window was ±4 lines. Prettier formats long named-import lists one identifier per line, so a 20-import block pushed the `import` keyword out of the window and the dep dropped to the string-literal fallback (or worse, was missed entirely). Widened to ±25 -- still bounded enough to keep false-positives negligible, wide enough for the realistic Prettier ceiling. Tests: added 10 _next_real_bin unit cases + 4 scripts_bin_refs end-to-end cases (W01-W10 + I01-I04) and a 22-identifier multi-line import adversarial case (A13). Full suite: 125/125.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 418b63b7cd
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| # CLI invocations in shell scripts / workflows / Dockerfiles. | ||
| hits.extend(find_command_usage(name)) | ||
| # @types/X is "used" if X is referenced as a type or as a |
There was a problem hiding this comment.
Skip command-bin detection for removed @types packages
The removal loop always calls find_command_usage(name) even when name is an @types/* package, but find_command_usage() derives candidate binaries from the package tail (for @types/npm, it scans for npm). That makes unrelated workflow/shell prose or commands containing that word count as a usage and can incorrectly fail safe removals of type-only deps that are not runtime CLIs. In practice this produces false FAIL results for synthetic removals like @types/npm, so dependency cleanup PRs can be blocked even when no TypeScript/runtime reference exists.
Useful? React with 👍 / 👎.
|
Addressing the 4x Opus reviewer batch (commit Per the earlier direction, P1s + P2s only.
Tests: 10 The three P3 comment-accuracy items about Bun lockfile migration wording on #5479 were left for a follow-up — the defensible policy ("skip Bun when no |
Summary
studio/frontend/package.json.This was motivated by the recent #5438 (drop unused
nextdep) and #5477 (drop 11 more unused deps). Both are safe today, but only because every removal was hand-verified. This change turns that verification into a repeatable check.What the check catches
For every removed package, it greps the entire repo (every
.ts/.tsx/.js/.jsx/.mjs/.cjs/.html/.htm/.css/.scss/.sass/.json/.jsonc/.md/.mdx/.py/.rs/.toml/.yml/.yaml/.sh/.ps1/.bat/Dockerfile*outsidenode_modules,dist,build,target,.git,__pycache__,.venv) for these patterns, then feeds an 8-line window into the classifier so multi-line statements are also caught:import X from "pkg"/import { X } from "pkg/sub"(static imports, single or multi-line)import type { X } from "pkg"(type-only imports)import "pkg"(side-effect imports; CSS@importdistinguished via lookbehind)import("pkg")(dynamic imports)export * from "pkg"/export { x } from "pkg"/export type { Foo } from "pkg"(re-exports, single or multi-line)require("pkg")andrequire.resolve("pkg")@import "pkg"(CSS imports)<script src="...pkg">and<link href="...pkg">(HTML)/// <reference types="pkg" />(TypeScript triple-slash)new URL("pkg/...", import.meta.url)patternsurl(.../pkg/...)references@import("pkg")referencesvite.config,tsconfigpaths,eslintplugin arrays, shadcncomponents.json, biome configs, etc.)Word-boundary safeguards keep
foofrom matchingfoobar,lodashfrom matchinglodash-es, and@radix-ui/react-labelfrom matching@radix-ui/react-label-extra. Package names appearing only in plain-text comments or URL tails do not trip the detector.How the install-graph reachability works
After collecting source references, the checker resolves whether the package would still install after the removal:
package.json's declared deps through the lockfile dep graph.node_modules/<name>paths count as importable fromsrc/. Nested copies under another package are noted but flagged as not importable from bare specifiers.package.json, not the lockfile's<root>map, so a stale lockfile cannot give a falseOK-via-transitiveverdict.A
pip install playwrightreference in a Python file or workflow is filtered out so removing the npmplaywrightis reported correctly.Hygiene warnings (non-fatal unless
--strict)package-lock.json's<root>dep map disagrees withpackage.json(lockfile drift, runnpm install).@types/<X>is declared but<X>is no longer declared anywhere.importstatement instudio/frontend/src/references a bare specifier that has no matching dep declared in any field ofpackage.json.Test suite (
tests/studio/test_frontend_dep_removal.py)Three suites, 64 cases total, all passing.
Suite 1: 24 end-to-end edge cases (subprocess-driven, full pipeline)
next-themes(2 src imports, no transitive parent)@xyflow/react(root-only, src imports recipe-studio)katex(kept top-level by@streamdown/math,mermaid,rehype-katex)clsx(kept bystreamdown)react(kept by countless peers)@radix-ui/react-slot(kept byradix-ui+@assistant-ui/react)zustand(top-level via@assistant-ui/react; nested xyflow 4.x irrelevant)@huggingface/hub(5 src imports)tailwind-merge(kept bystreamdown)@hugeicons/reactfrom deps to devDeps@huggingface/hubAND@xyflow/reacttogetherdexie(multiple src imports)motion(20+ src imports)canvas-confetti(used in confetti.tsx)recharts(used in chart.tsx)js-yaml(kept by@eslint/eslintrc; @types/js-yaml orphan warning)node-forge(used in providers-api.ts)@tauri-apps/api(kept by all 5@tauri-appsplugins)mammothunpdfremark-gfm(kept bystreamdown)date-fns(kept byreact-day-pickerand@base-ui/react)Suite 2: 28 classify() unit cases (direct function call against hand-crafted snippets)
Covers each pattern shape: static import, side-effect import, dynamic import, require, css_import, html_script, html_link, re_export (4 variants:
*, named, type, multi-line), template_literal, new_url, tsc_triple_slash, jsdoc_import, string_literal. False-positive guards: substring collision (foovsfoobar), exact-prefix collision (lodashvslodash-es), scoped-pkg collision (@radix-ui/react-labelvs...-extra), plain-text comments, URL tails, Python files, markdown links.Suite 3: 12 adversarial end-to-end cases (write synthetic files, run full script, clean up)
Each case drops a synthetic file under
studio/frontend/src/__dep_check_adversarial__/, runs the script as a subprocess against a base that adds a synthetic package, and asserts the outcome. Then cleans up.import { ... } from "removed_pkg"export * from "removed_pkg"export { foo } from "removed_pkg"export type { Foo } from "removed_pkg"pkg_evspkg_e_extra)new URL("pkg/...")@import("removed_pkg")CI integration
studio-frontend-ci.ymlgets one new step onpull_requestevents:The script reads
package-lock.jsondirectly, so it completes in well under a second and runs beforenpm ci. The full test suite (64 cases) runs end-to-end in a few seconds.Manual usage
Exit codes:
0clean,1one or more removed deps still referenced and not reachable,2invocation error.Test plan
[OK] no dependencies removed.next-themesreturns exit 1 with file:line citations of every offending import.studio-frontend-ciruns the new step on this PR (and reports no findings since this PR removes nothing).